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Abstract 



1994D (semi) automatic methods for associating 



This paper explores the automatic construction 
of a multilingual Lexical Knowledge Base from 
pre-existing lexical resources. We present a new 
^nd robust npprnnch for Imkmg already existing tologies are described in (|Utiyama and Hasida. 



a Japanese lexicon to an English ontology us- 
ing a bilingual dictionary are described. Sev- 
eral experiments aligning edr and WordNet on- 



levir.al/semRntir hierarrhies. We used a rnn- 

straint satisfaction algorithm (relaxation label- 
ing) to select -among all the candidate trans- 
lations proposed by a bilingual dictionary- the 
right English WordNet synset for each sense in 
a taxonomy automatically derived from a Span- 
ish monolingual dictionary. Although on aver- 
age, there are 15 possible WordNet connections 
for each sense in the taxonomy, the method 
achieves an accuracy over 80%. Finally, we also 
propose several ways in which this technique 
could be applied to enrich and improve exist- 
ing lexical databases. 

1 Introduction 

There is an increasing need of having available 
general, accurate and broad coverage multilin- 
gual lexical/semantic resources for developing 
NL applications. Thus, a very active field in- 
side NL during the last years has been the fast 
development of generic language resources. 
Several attempts have been performed to pro- 



duce multilingual ontologies. In ( |Ageno et al. 



1994), a Spanish/English bilingual dictionary is 



used to (semi) automatically link Spanish and 
English taxonomies extracted from DGILE (Al- 
var, 1987| ) and ldoce ( Procter, 19871 ). Sim- 



ilarly, a simple automatic approach for link- 
ing Spanish taxonomies extracted from DGILE 
to WordNet ([Miller et al., 1991]) synsets is 

proposed in ([Rigau et al., 1995 ). The work 

reported in ( Knight and Luk, 1994 ) focuses 
on the construction of Sensus, a large knowl- 
edge base for supporting the Pangloss machine 
translation system. In (Okumura and Hovy, 



1997). Several lexical resources and techniques 
are combined in ( [Atserias et al., 1997 ) to map 



Spanish words from a bilingual dictionary to 
WordNet, and in ([Farreres et al., 1998[ ) the 
use of the taxonomic structure derived from a 
monolingual MRD is proposed as an aid to this 
mapping process. 

This paper presents a novel approach for 
merging already existing hierarchies. The 
method has been applied to attach substan- 
tial fragments of the Spanish taxonomy derived 
from DGILE ([Rigau et al., 199^ ) to the Enghsh 
WordNet using a bilingual dictionary for con- 
necting both hierarchies. 

This paper is organized as follows: In section 
^ we describe the used technique (the relaxation 
labeling algorithm) and its application to hier- 
archy mapping. In section H we describe the 
constraints used in the relaxation process, and 
finally, after presenting some experiments and 
preliminary results, we offer some conclusions 
and outline further lines of research. 

2 Application of Relaxation 
Labeling to NLP 

Relaxation labeling (RL) is a generic name for 
a family of iterative algorithms which perform 
function optimization, based on local informa- 
tion. See ( [Terras, 1989 ) for a summary. Its 
most remarkable feature is that it can deal with 
any kind of constraints, thus, the model can be 
improved by adding any constraints available, 
and the algorithm is independent of the com- 
plexity of the model. That is, we can use more 
sophisticated constraints without changing the 



algorithm. 

The algorithm has been applied to POS tag- 
ging (^arquez and Padro, 1997|), shallow pars- 



no more changes, although more sophisti- 
cated heuristic procedures may also be used 
to stop relaxation processes ([Eklundh and 



ing (iVoutilainen and Padro, 19971) and to word Rosenfeld, 1978|; [Richards et al., 1981 ) 



sense disambiguation ( Padro, 19981 ) 

Although other function optimization algo- 
rithms could have been used (e.g. genetic algo- 
rithms, simmulated annealing, etc.), we found 
RL to be suitable to our purposes, given its abil- 
ity to use models based on context constraints, 
and the existence of previous work on applying 
it to NLP tasks. 

Detailed explanation of the algorithm can be 
found in ( p?orras, 19891) , while its application 
to NLP tasks, advantages and drawbacks are 
addressed in ( Padro, 199 j ). 



2.1 Algorithm Description 

The Relaxation Labeling algorithm deals with 
a set of variables (which may represent words, 
synsets, etc.), each of which may take one 
among several different labels (pos tags, senses, 
MRD entries, etc.). There is also a set of con- 
straints which state compatibility or incompat- 
ibility of a combination of pairs variable-label. 

The aim of the algorithm is to find a weight 
assignment for each possible label for each vari- 
able, such that (a) the weights for the labels of 
the same variable add up to one, and (b) the 
weight assignation satisfies -to the maximum 
possible extent- the set of constraints. 

Summarizing, the algorithm performs con- 
straint satisfaction to solve a consistent labeling 
problem. The followed steps are: 

1. Start with a random weight assignment. 

2. Compute the support value for each label 
of each variable. Support is computed ac- 
cording to the constraint set and to the cur- 
rent weights for labels belonging to context 
variables. 

3. Increase the weights of the labels more 
compatible with the context (larger sup- 
port) and decrease those of the less com- 
patible labels (smaller support). Weights 
are changed proportionally to the support 
received from the context. 

4. If a stopping/convergence criterion is sat- 
isfied, stop, otherwise go to step 2. We 
use the criterion of stopping when there are 



The cost of the algorithm is proportional to 
the product of the number of variables by the 
number of constraints. 

2.2 Application to taxonomy mapping 

As described in previous sections, the problem 
we are dealing with is to map two taxonomies. 
That is: 



The starting point is a sense disam- 
biguated Spanish taxonomy -automatically 
extracted from a monolingual dictionary 
(IRigau et al., 1998| )-. 



• We have a conceptual taxonomy (e.g. 
WordNet ([Miller et al., 1991| )), in which 
the nodes represent concepts, organized as 
synsets. 

• We want to relate both taxonomies in order 
to have an assignation of each sense of the 
Spanish taxonomy to a WN synset. 

The modeling of the problem is the following: 

• Each sense in the Spanish taxonomy is a 
variable for the relaxation algorithm. 

• The possible labels for that variable, are 
all the WN synsets which contain a word 
that is a possible translation of the Span- 
ish sense. Thus, we will need a bilingual 
dictionary to know all the possible trans- 
lations for a given Spanish word. This has 
the effect of losing the sense information we 
had in the Spanish taxonomy. 

• The algorithm will need constraints stating 
whether a synset is a suitable assignment 
for a sense. These constraints will rely on 
the taxonomy structure. Details are given 
in section y. 

3 The Constraints 

Constraints are used by relaxation labeling al- 
gorithm to increase or decrease the weight for a 
variable label. In our case, constraints increase 
the weights for the connections between a sense 
in the Spanish taxonomy and a WordNet synset. 
Increasing the weight for a connection implies 



decreasing the weights for all the other possi- 
ble connections for the same node. To increase 
the weight for a connection, constraints look for 
already connected nodes that have the same re- 
lationships in both taxonomies. 

Although there is a wide range of relation- 
ships between WordNet synsets which can be 
used to build constraints, we have focused on 
the hyper/hyponym relationships. That is, we 
increase the weight for a connection when the 
involved nodes have hypernyms/hyponyms also 
connected. We consider hyper/hyponym rela- 
tionships either directly or indirectly (i.e. an- 
cestors or descendants), depending on the kind 
of constraint used. 

Figure |l| shows an example of possible con- 
nections between two taxonomies. Connection 
C4 will have its weight increased due to C5, Cq 
and Ci, while connections C2 and C3 will have 
their weights decreased. 
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Figure 1: Example of connections between tax- 
onomies. 

Constraints are coded with three characters 
XYZ, which are read as follows: The last char- 
acter, z, indicates whether the constraints re- 
quires the existence of a connected hypernym 
(e), hyponym (o), or both (b). The two first 
characters indicate how the hyper/hyponym re- 
lationship is considered in the Spanish taxon- 
omy (character x) and in WordNet (charac- 
ter y): (l) indicates that only immediate hy- 
per/hyponym match, and (a) indicates that any 
ancestor /descendant matches. 

Thus, we have constraints iie/iio which in- 
crease the weight for a connection between a 
Spanish sense and a WordNet synset when there 
is a connection between their respective hyper- 
nyms/hyponyms. Constraint IIB requires the si- 
multaneous satisfaction of he and iio. 



Similarly, we have constraints iae/iao, which 
increase the weight for a connection between a 
Spanish sense and a WordNet synset when there 
is a connection between the immediate hyper- 
nym/hyponym of the Spanish sense and any an- 
cestor/descendant of the WN synset. Constraint 
lAB requires the simultaneous satisfaction of lAE 
and lAO. Symmetrically, constraints AIE, AIO 
and AIB, admit recursion on the Spanish taxon- 
omy, but not in WordNet. 

Finally, constraints AAE, AAO and AAB, admit 
recursion on both sides. 

For instance, the following example shows a 
taxonomy in which the IIE constraint would 
be enough to connect the Spanish node ra- 
paz to the <bird-oLprey> synset, given that 
there is a connection between ave (hypernym 
of rapaz) and animal <hird> (hypernym of 
< bird-oLprey> ) . 

animal ^=^(Tops <animal,animate-heing,...>) 
^=^(person <beast,brute,...>) 
^=^(person <dunce,blockhead,...>) 
ave ^^(animal <bird>) 

^=^ (animal <fowl,poultry,...>) 
^=>(artifact <bird,shuttle,...>) 
^^(food <fowl,poultry,...>) 
^^(person <dame,doU,...>) 
faisan ^^(animal <pheasant>) 

^^(food <pheasant>) 
rapaz ^=^ (animal <bird-oLprey,...>) 
^=^(person <cub,lad,...>) 
^=^(person <chap,feUow,. . .>) 
=^ (person <lass,young_girl, ...>) 

Constraint he would -wrongly- connect the 
Spanish sense faisan to the food <pheasant> 
synset, since there is a connection between its 
immediate hypernym (ave) and the immedi- 
ate hypernym food <pheasant> (which is food 
<fowl,poultry,...>), but the animal synsets for 
ave are non-immediate ancestors of the animal 
synsets for <pheasant>. This would be rightly 
solved when using lAE or AAE constraints. 

More information on constraints and their ap- 
plication can be found in ( Daude et al., 1999| ). 



4 Experiments and Results 

In this section we will describe a set of experi- 
ments and the results obtained. A brief descrip- 
tion of the used resources is included to set the 
reader in the test environment. 



4.1 Spanish Taxonomies 

We tested the relaxation labeling algorithm 
with the described constraints on a set of 
disambiguated Spanish taxonomies automat- 
ically acquired from monolingual dictionar- 
ies. These taxonomies were automatically as- 



signed to a WordNet semantic file (Rigau et 
al., 19971 ; |Rigau et al., 19981) . We tested 



the performance of the method on two dif- 
ferent kinds of taxonomies: those assigned 
to a well defined and concrete semantic files 
(noun. animal, noun. food), and those assigned 
to more abstract and less structured ones 
(noun, cognition and noun, communication). 

We performed experiments directly on the 
taxonomies extracted by ( [Rigau et al., 1997 ), 
as well as on slight variations of them. Namely, 
we tested on the following modified taxonomies: 

-|-top Add a new virtual top as an hypernym 
of all the top nodes of taxonomies belong- 
ing to the same semantic file. The virtual 
top is connected to the top synset of the 
WordNet semantic file. In this way, all the 
taxonomies assigned to a semantic file, are 
converted to a single one. 

no-senses The original taxonomies were built 
taking into account dictionary entries. 
Thus, the nodes are not words, but dictio- 
nary senses. This test consists of collaps- 
ing together all the sibling nodes that have 
the same word, regardless of the dictionary 
sense they came from. This is done as an 
attempt to minimize the noise introduced 
at the sense level by the taxonomy building 
procedure. 

4.2 Bilingual dictionaries 

The possible connections between a node in the 
Spanish taxonomy and WN synsets were ex- 
tracted from bilingual dictionaries. Each node 
has as candidate connections all the synsets for 
all the words that are possible translations for 
the Spanish word, according to the bilingual 
dictionary. Although the Spanish taxonomy 
nodes are dictionary senses, bilingual dictionar- 
ies translate words. Thus, this step introduces 
noise in the form of irrelevant connections, since 
not all translations necessarily hold for a single 
dictionary sense. 

We used an integration of several bilingual 
sources available. This multi-source dictionary 



contains 124,949 translations (between 53,830 
English and 41,273 Spanish nouns). 

Since not all words in the taxonomy appear in 
our bilingual dictionaries, coverage will be par- 
tial. Table |I| shows the percentage of nodes in 
each taxonomy that appear in the dictionaries 
(and thus, that may be connected to WN). 

Among the words that appear in the bilingual 
dictionary, some have only one candidate con- 
nection -i.e. are monosemous-. Since selecting 
a connection for these cases is trivial, we will fo- 
cus on the polyseniic nodes. Table |2| shows the 
percentage of polysemic nodes (over the num- 
ber of words with bilingual connection) in each 
test taxonomy. The average polysemy ratio 
(number of candidate connections per Spanish 
sense) is 15.8, ranging from 9.7 for taxonomies 
in noun. animal, to 20.1 for less structured do- 
mains such as noun. communication. 

4.3 Results 

In the performed tests we used simultaneously 
all constraints with the same recursion pattern. 
This yields the packs: ii*, Ai*, lA* and AA*, 
which were applied to all the taxonomies for the 
four test semantic files. 

Table |3| presents coverage figures for the dif- 
ferent test sets, computed as the amount of 
nodes for which some constraint is applied and 
thus their weight assignment is changed. Per- 
centage is given over the total amount of nodes 
with bilingual connections. 

To evaluate the precision of the algorithm, we 
hand checked the results for the original tax- 
onomies, using AA* constraints. Precision re- 
sults can be divided in several cases, depending 
on the correctness of the Spanish taxonomies 
used as a starting point. 

Tqk^Fok The Spanish taxonomy was well 
built and correctly assigned to the semantic 
file. 

ToK, Fnok The Spanish taxonomy was well 
built, but wrongly assigned to the semantic 
file. 

Tnok The Spanish taxonomy was wrongly 
built. 

In each case, the algorithm selects a connec- 
tion for each sense, we will count how many 
connections are right /wrong in the first and sec- 
ond cases. In the third case the taxonomy was 





original 


+top 


no-senses 


noun . animal 


45% 


45% 


43% 


noun. food 


55% 


56% 


52% 


noun. cognition 


54% 


55% 


52% 


noun. communication 


66% 


66% 


64% 



Table 1: Percentage of nodes with bilingual connection in each test taxonomy. 
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no-senses 


noun 


animal 


77% 


77% 


75% 


noun 


food 


81% 


81% 


79% 


noun 


cognition 


74% 


74% 


72% 


noun 


communication 


87% 


87% 


86% 



Table 2: Percentage of nodes with more than one candidate connection. 



wrongly extracted and is nonsense, so the assig- 
nations cannot be evaluated. 

Note that we can distinguish right/wrong 
assignations in the second case because the con- 
nections are taken into account over the whole 
WN, not only on the semantic file being pro- 
cessed. So, the algorithm may end up correctly 
assigning the words of a hierarchy, even when 
it was assigned to the wrong semantic file. For 
instance, in the hierarchy 

piel {skin, fur, peel, pelt) 

^^marta {sable, marten, coaLhack) 
^^vison {mink, mink_coat) 

all words may belong either to the semantic file 
noun. substance (senses related to fur, pelt) or 
to noun. animal {animal, animaLpart senses), 
among others. The right noun, substance 
synsets for each word are selected, since there 
was no synset for piel that was ancestor of the 
animal senses of marta and vison. 

In this case, the hierarchy was well built, and 
well solved by the algorithm. The only mistake 
was having assigned it to the noun . animal se- 
mantic file, so we will count it as a right choice 
of the relaxation labeling algorithm, but write 
it in a separate column. 

Tables Q and ^ show the precision rates for 
each original taxonomy. In the former, fig- 
ures are given over polysemic words (nodes with 
more than one candidate connection). In the 
later, figures are computed overall (nodes with 
at least one candidate connection). 

Accuracy is computed at the semantic file 
level, i.e., if a word is assigned a synset of the 



right semantic file, it is computed as right, oth- 
erwise, as wrong. 

To give an idea of the task complexity and the 
quality of the reported results, even with this 
simplified evaluation, consider the following: 

• Those nodes with only one possible synset 
for the right semantic file (30% in average, 
ranging from 22% in noun, communication 
to 45% in noun . animal) are not affected 
by the evaluation at the semantic file level. 

• The remaining nodes have more than 
one possible synset in the right se- 
mantic file: 6.3 in average (ranging 
from 3.0 for noun. animal to 8.7 for 
noun. communication). 

• Thus, we can consider that we are eval- 
uating a task easier than the actual one 
(the actual evaluation would be performed 
at the synset level). This simplified task 
has an average polysemy of 6.7 possible 
choices per sense, while the actual task at 
the synset level would have 15.8. Although 
this situates the baseline of a random as- 
signment about 15% instead of 6%, it is 
still a hard task. 

5 Conclusions 

We have applied the relaxation labeling algo- 
rithm to assign an appropriate WN synset to 
each node of an automatically extracted tax- 
onomy. Results for two different kinds of con- 
ceptual structures have been reported, and they 
point that this may be an accurate and robust 
method (not based on ad-hoc heuristics) to con- 
nect hierarchies (even in different languages). 



WNfile 


taxonomy 


II* 


AI* 


lA* 


AA* 


noun . animal 


original 

+top 

no-senses 


134 (23%) 
138 (24%) 
118 (23%) 


135 (23%) 
143 (25%) 
119 (20%) 


357 (62%) 
375 (65%) 
311 (61%) 


365 (63%) 
454 (78%) 
319 (62%) 


noun. food 


original 

+top 

no-senses 


119 (36%) 
134 (40%) 
102 (36%) 


130 (39%) 
158 (47%) 
111 (39%) 


164 (49%) 
194 (58%) 
153 (51%) 


180 (63%) 
259 (77%) 
156 (55%) 


noun. cognition 


original 

-l-top 

no-senses 


225 (37%) 
230 (38%) 
192 (37%) 


230 (38%) 
240 (40%) 
197 (38%) 


360 (60%) 
395 (65%) 
306 (59%) 


373 (62%) 
509 (84%) 
318 (61%) 


noun. communication 


original 

-l-top 

no-senses 


552 (43%) 
589 (46%) 
485 (43%) 


577 (45%) 
697 (54%) 
509 (45%) 


737 (57%) 
802 (62%) 
645 (57%) 


760 (59%) 

1136 (88%) 

668 (59%) 



Table 3: Coverage of each constraint set for different test sets. 



precision over precision over total precision 
ToK,FoK Tok,Fnok over To k 



animal 
food 

cognition 
communication 



279 (90%) 
166 (94%) 
198 (67%) 
533 (77%) 



30 (91%) 
3 (100%) 
27 (90%) 
40 (97%) 



309 (90%) 
169 (94%) 
225 (69%) 
573 (78%) 




Table 4: Precision results over polysemic words for the test taxonomies. 



The experiments performed up to now seem 
to indicate that: 

• The relaxation labeling algorithm is a good 
technique to link two different hierarchies. 
For each node with several possible connec- 
tions, the candidate that best matches the 
surrounding structure is selected. 

• The only information used by the algorithm 
are the hyper/hyponymy relationships in 
both taxonomies. These local constraints 
are propagated throughout the hierarchies 
to produce a global solution. 

• There is a certain amount of noise in the 
different phases of the process. First, the 
taxonomies were automatically acquired 
and assigned to semantic files. Second, the 
bilingual dictionary translates words, not 
senses, which introduces irrelevant candi- 
date connections. 

• The size and coverage of the bilingual dic- 
tionaries used to establish the candidate 
connections is an important issue. A dic- 
tionary with larger coverage increases the 
amount of nodes with candidate connec- 
tions and thus the algorithm coverage 



6 Proposals for Further Work 

Some issues to be addressed to improve the al- 
gorithm performance are the following: 

• Further test and evaluate the precision of 
the algorithm. In this direction we plan 
-apart from performing wider hand check- 
ing of the results, both to file and synset 
level- to use the presented technique to link 
WN1.5 with WN1.6. Since there is already 
a mapping between both versions, the ex- 
periment would provide an idea of the ac- 
curacy of the technique and of its applica- 
bility to different hierarchies of the same 
language. In addition, it would constitute 
an easy way to update existing lexical re- 
sources. 

• Use other relationships apart from hy- 
per/hyponymy to build constraints to se- 
lect the best connection (e.g. sibling, 
cousin, synonymy, meronymy, etc.). 

• To palliate the low coverage of the bilingual 
dictionaries, candidate translations could 
be inferred from connections of surround- 
ing senses. For instance, if a sense has no 
candidate connections, but its hypernym 



precision over precision over total precision 
ToK,FoK Tok,Fnok over To k 



animal 
food 

cognition 
communication 



424 (93%) 
166 (94%) 
200 (67%) 
536 (77%) 



62 (95%) 
83 (100%) 
245 (99%) 
234 (99%) 



486 (93%) 
149 (96%) 
445 (82%) 
760 (81%) 



Table 5: Precision results over all words for the test taxonomies. 



does, we could consider as candidate con- 
nections for that node all the hyponynis of 
the synset connected to its hypernym. 

• Use the algorithm to enrich the Spanish 
part of EuroWordNet taxonomy. It could 
also be applied to include taxonomies for 
other languages not currently in the EWN 
project. 

In addition, some ideas to further exploit the 
possibilities of these techniques are: 

• Use EWN instead of WN as the target tax- 
onomy. This would largely increase the 
coverage, since the candidate connections 
missing in the bilingual dictionaries could 
be obtained from the Spanish part of ewn, 
and viceversa. In addition, it would be use- 
ful to detect gaps in the Spanish part of 
EWN, since a EWN synset with no Spanish 
words in ewn, could be assigned one via 
the connections obtained from the bilingual 
dictionaries. 

• Since we are connecting dictionary senses 
(the entries in the mrd used to build the 
taxonomies) to EWN synsets: First of all, 
we could use this to disambiguate the right 
sense for the genus of an entry. For in- 
stance, in the Spanish taxonomies, the 
genus for the entry quesoA (cheese) is masa 
(mass) but this word has several dictio- 
nary entries. Connecting the taxonomy 
to EWN, we would be able to find out 
which is the appropriate sense for masa, 
and thus, which is the right genus sense for 
quesoA. Secondly, once we had each dic- 
tionary sense connected to a ewn synset, 
we could enrich ewn with the definitions 
in the MRD, using them as Spanish glosses. 

• Map the Spanish part of ewn to wn1.6. 
This could be done either directly, or via 
mapping WNl.5— WNl.6. 
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